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I Abstract 

Time scale separation is a natural property of many control systems that can be ex- 
ploited, theoretically and numerically. We present a numerical scheme to solve optimal 
control problems with considerable time scale separation that is based on a model reduction 
D approach that does not need the system to be explicitly stated in singularly perturbed form. 

We present examples that highlight the advantages and disadvantages of the method. 

I — I 1 Introduction 

u 

Optimization based control in practice depends on accurate models with small prediction error 
T and computation of a (feedback) control that is close or at least consistent with the true op- 

timal control for the process under consideration Often the desired accuracy can only be 
2 provided by nonlinear large scale models which leads to problems in online control, for example 

y via nonlinear model predictive control (NMPC) , [51 13] due to the computational demand. Model 

reduction therefore plays an important part in the development of control systems, see the pa- 
I per by Marquardt [23J who gives a concise review of model development and model reduction 

^ techniques. 

On Model reduction can be divided into model order reduction which aims at decreasing the 

dimension of the state space and model simplification which tries to simplify the evaluation of 
the model equations. Both approaches can be combined and essentially strive to capture the 
most important features of the dynamic process at the cost of an error in the reduced compared 
to the full model. The trade off between lost accuracy and benefits of the reduced model always 
has to be considered and carefully balanced depending on the application at hand. 

> 2 Model Order Reduction 



X 



In the remainder of this article we will only refer to model order reduction. Therefore we introduce 
the system 

z = f(z,u), z(0) = Zq (1) 

with state z{t) € M" and control u{t) € M™. The right hand side / : R" x ^ M" is assumed 
to be in C°° . A general approach to model order reduction can be summarized in the following 
steps 
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1. Find a difFeomorphism T : M" M" that maps z via 

i-r = T{z) ^ z = z* + T-^{S), 

onto the new state z{t) € M", where z* is a possibly nonzero set point. The aim of this 
coordinate change is to separate directions in the phase space of ([T|) that have strong 
contributions to the dynamics from those that only contribute in a minor way. 

2. Decompose the new state space into x{t) e MP and y{t) E such that z — {x,y)'^ and 
n = p + q. Here x will play the role of the dominant states. 

3. Assemble new dynamic systems for x and y from 

i = (D,T(z))-i/(r +T(2),u) 

and obtain 

i^f{x,y,u), a::(0) = a;o, 
y = 9{x,y,u), y(Q)^y„. 

The smoothness of the right hand sides / : x x ^ and 5 : x M"? x M" E.P 
is determined by the smoothness of T and T^^. 

4. Eliminate the dynamic equation for y by one of the following methods: 
Truncation: Set y ^ for the reduced dynamic system 

i = /(i, 0, u), x(0) = xo 

with x ~ X and state space dimension m. 
Residualization: Set y = to obtain the differential-algebraic system 

x^f{x,y,u), x{0)^xo, 
= g{x,y,u). 

The dimension of the model is not reduced. 

Slaving: Obtain a map y — (j){x,u) either from the residualization approach by solving 
the algebraic equation explicitly or through an independent method. Using 

X = f{x, (l){x,u),u), x{0) = xo 

leads to a reduced model with state space dimension m. 

Several model order reduction methods have been proposed in the past and we proceed to 
give a short description of some of them. 

Nonlinear balancing is an analytical method based on the theory of nonlinear Hankel operators 
and their attributed singular value functions aimed at obtaining a nonlinear map T In 
practice, empirical balancing that incorporates samples of the systems behavior for different 
inputs and initial values can be used j ilOj . In that case the map T is linear. 

Proper orthogonal decomposition (POD) is based on sampling representative trajectories of 
([1]), called snapshots [13[. Similarly to balancing a linear transformation matrix T is obtained 
using singular value decomposition. For a focus in the context of optimal control see |15l. 

Additional approaches include combinations of balancing and POD [TB] and moment matching 
for nonlinear systems fT\. 
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2.1 Slow Invariant Manifolds 



Our approach to model order reduction is based on time scale separation which is a frequent 
feature of complex dynamic processes. A theoretical environment for such system is provided in 
singular perturbation theory [llj where we deal with a system of the form 

i ^ f{x,y,u,e), x(0)=^(£), 
= 9{x, y, u, e), y{0) = ri{e). 

The parameter e is assumed to be small (0 < e <C 1) and reflects the time scale separation. The 
fast modes y evolve on the time scale 0{e^^) whereas the slow state dynamics are 0(1). We put 
forward the following assumptions: 

Al The involved functions /, g, ^, and ry are at least R + 2 times continuously differentiable 
with respect to their arguments on their respective domains of interest. 

A2 Let e = in ([2|, then the reduced system is given by 

i = f{x,y,u,0), x{0)=m, 
= g{x,y,u,0), y(0) = 77(0). 

There exist solutions x{t) = xo(i) and y{t) ~ yo(0 of the reduced system for t e [0,T]. 
A3 The Jacobian 

gy^-Dyg{xoit),yo{t),u{t),0)eR'''"' 

has I < k < q eigenvalues A^, i — 1,2, ... ,k with 3?(Ai) < — /i and q — k eigenvalues with 
3?(Ai) > /i, i — k + l,k + 2,...,q where /i > 0. In other words the Jacobian has no purely 
imaginary eigenvalues and there is at least one stable direction. 

From the third condition it follows that gy is nonsingular, so the algebraic equation = g{x, y, u, e) 
can (at least locally) always be solved with respect to y. Let the assumptions A1-A3 hold, then 
there exists ([TT], Theorem 1) an Eq and a /c-dimensional manifold S{e) such that the solution of 
([2| can be expanded into series representations for £ < eo provided that 77(e) C S{e): 

x(t, u, e) — x*{t, u, e) + X{t/e, u, e), 
y{t, u, e) = y* {t, u, e) + Y{t/e, u, e) 

with 

i?, R 
x*{t, u,e)^Yl ")^'' + 0{e^^'), y* {t, u,e) = Y^ y* {t, u)e^ + 0(e«+i), 

R. R 

X{t/e, u,e)=Y^ Xr{t/e, u)e'' + 0{e^'+^), Y{t/e, u, e) = ^ Yr{t/e, u)e'' + 0{e^+^). 

The fast motions are captured in the so called boundary layer corrections X{t/ e, u) and Y{t/ e, u) 
which converge to exponentially fast. For the purpose of model order reduction we neglect the 
boundary layer correction and focus on the slow or outer solution x*{t,u,e). A central result is 
the following that goes back to Fenichel [7J , see also [121 , and is related to the geometric singular 
perturbation approach to the problem. 
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Theorem 1 ([12j, Theorem 2.1, Fenichel, asymptotically stable slow manifolds). Let assumptions 
A1-A3 hold. Then, for any sufficiently small e, there is a function h that is defined on a compact 
domain K dW^ x M™ such that the graph 

Me = {{x, y)\y^ h{x, u, e), {x, u) G K) 

is locally invariant under The function h admits an asymptotic expansion, 

R 

h{x, u,e) = J2 f^r{x, + 0(e'^+^). (3) 

The manifold Aie is also called slow invariant manifold (SIM). Utilizing h(x,u,e) we can 
reduce the system ^ to 

X* = f{x*,h{x*,u,e),u,e), x{0) = ^(e), 

in accordance to the slaving approach introduced earlier. In practice only an approximation to h 
can be feasibly computed. Using Hq corresponds to setting e = in Q and solving the algebraic 
system for y, the fast states are assumed to be relaxed immediately. 

Using an explicit formula or the SIM for model order reduction presumes that the system 
is given in singularly perturbed form and therefore the method skips step 1 of the algorithm 
discussed earlier. For general systems ([T|) a nonlinear coordinate transformation T would have to 
be explicitly known to apply the outlined theory. For systems where the small parameter e can 
be identified, conditions on when such a transformation exists and how it might be constructed 
are given in |22] . Otherwise a state space decomposition into fast and slow modes has to be based 
on physical insight or numerical methods. Among them are eigenvalue analysis of the linearized 
system or singular value analysis of sensitivity matrices jl8j. 

2.2 Approximation of the SIM 

To approximate the SIM, in general only a numerical procedure will be feasible either because 
the analytic computations to obtain the coefficients for the asymptotic expansion ([3| of /i can 
not be carried out explicitly or the system can not be transformed to the singular perturbed 
form. In this section we therefore regard the general system 

i = f{x,y,u), 
y = 9{x,y,u). 

We assume that the state space decomposition into fast modes x and slow modes y is already 
carried out either by employing a priori knowledge or by using one of the methods mentioned 
above. Our approach |21l 1171 1281 119j is based on optimization of trajectory pieces or points in 
the state space where the slow variables are fixed at a certain point in time t* and the according 
fast states are computed in dependence of the slow variables, i.e. the SIM is parametrized by 
the slow states and can represented by a function smooth function y — h{x,u). The underlying 
idea is that the fast states will relax onto the SIM as fast as the system dynamics possibly allow 
and then stay on it. A parametrization for u also has to be chosen. We will use a piecewise 
constant control function later in a multiple shooting approach to solve the optimal control 
problem, therefore we will use u = const here. The optimization problem for the computation of 
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a reduced model is 



min $(a;, y, u) 



V 



subject to: x — f{x, y, u) 

y = g{x,y,u) 

x{t*) = X*, 



u — u 



(4) 



Let F be the full right hand side vector, hence 



^ ^ \g{x,y,u)) 



and J be the Jacobian of the full dynamic system with respect to x and y, i.e. 



J = ^x,yFix,y,u) 



For $ we will either use 




(5) 







or 



<^{x,V,u) = \\JF\\l. 



(6) 



In the first objective the value of x is fixed at the end of the integration interval because the fast 
modes are unstable in backward time. This means that trajectories starting at points y* = y{t*) 
that are not on the manifold will exponentially move away from it and thus a large contribution 
to the objective function is created. In the second case the dynamic equations are no longer 
constraints of the optimization problem. 

Remark 1. || J_F'||2 in both objective functionals can be linked to minimizing curvature in the 
phase space [12] and a variational principle jH] . There is a relation to the zero derivative principle 



Additionally, for the application of h{x,u) in optimal control we also need at least first order 
sensitivities D^h{x,u) and Dy^h{x,u). However, they can be easily obtained from the KKT 
system at the solution point y* of the optimization problem Q 

Numerically, for the integral based objective ^ either single shooting or collocation is used to 
obtain a nonlinear program (NLP) which is subsequently solved with an interior point method 
implemented in the software package IPOPT [31]. The local formulation (|6| is solved with a 
general Gaufi-Newton method specifically tailored to the problem at hand f3D]. In all cases 
warm starts are employed to improve convergence of the optimization if the problem has to be 
solved for a series of fixed values {x* , u*). 



Singularly perturbed optimal control problems have been studied extensively in the past |14l 
[Ml IMl E] • The main idea is to decompose the full system into a slow and a fast part and infer 
properties and solutions of the full problem from the independent analysis of the two subproblems. 
The linear case is well understood, but for nonlinear systems the situation is much more difficult 
and only for special cases useful explicit results can be obtained. To highlight some of the 
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problems and what we can expect at most from our approach we review shortly the nonlinear 
state regulator problem [37]: 



Jo 

subject to: x = f{x,y,u,e), x{0) = xq, (7) 
£y = g{x,y,u,e), y{0) = yo, 
x{t) e W, y{t) e E«, u{t) e E™. 

For convenience all functions are supposed to be functions of their arguments on any domain 
of interest. The £y{l) in the final time cost avoids E to depend on the fast variable y for e = 0. 
Using the Pontryagin minimum principle with the Hamiltonian 

n{x,y,X^,Xy,u,e) = /o + X^f + X^g 

we get (additionally to the primal dynamic equations) the following ODE system for the adjoint 
variables \x and A^: 

A, = - n, A,(l) = E{x{l), ey(l), e), 

(8) 

sXy^-'DyH, ^ e'DyE{x{l),ey{l),£). 

We note that there are no restrictions on the value of the control u thus D,j H = is a necessary 
condition for a minimum to occur. Moreover we assume the strong Legendre-Clebsch condition, 
D„jj Ti, is positive definite, to hold. In that case a (locally) optimal control u{t) that minimizes 
the cost functional i{u) exists [4 for e > 0. It also allows to solve D^^ "H — (locally) for 
u — Lij{x, y, Xx, Xy, e) and replace it in ([t]) and (|8|. We now have a singularly perturbed boundary 
value problem. The main challenge with problems of this type is to determine a reasonable 
reduced problem, i.e. setting e = in ([T]) and (|8|. In general not all boundary values can be 
satisfied and some of them have to be relaxed. In this case the choice is obvious: y{0) and 
Xy{l), the boundary values associated with the fast modes can not be fulfilled because for e = 
their values are determined by algebraic equations. Even if a reduced problem can be stated, 
for nonlinear problems it is not generally possible to postulate conditions for a solution to exist. 
Besides the smoothness assumptions the following restrictions are necessary. 

Al' The reduced problem 

i ^ f{x,y,i^,0), x{0)=xo, 
A, = - n{x, y, A,, Xy,uj, 0), A,(l) = ^(x(l), 0, 0), 
= g{x,y,uj,0), 
= - 'Dyn{x,y,Xx,Xy,^,0), 

has a unique solution x'^{t), y°(i), A|^(i), and Xy{t). 
A2' The Jacobian 

evaluated along x'^it), y*'(t), X^t), and Xy{t) has no purely imaginary eigenvalues, moreover 
we require it to have exactly q eigenvalues with positive and q eigenvalues with negative 
real part. 
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For the problem at hand it turns out that T-Ly is block diagonal with two identical blocks with 
opposite sign and symmetric and therefore the second condition on T-Ly is automatically fulfilled 
if all eigenvalues have nonzero real part. The second condition guarantees the solvability of the 
algebraic part of the reduced system and the stability of the boundary layer corrections. If Al' 
and A2' hold, the solution of the full problem converges to the solution of the reduced problem 
for £ — > and the following series representations can be stated |27l [TT| : 

x{t,e) = x*(t,e) + XL{t/e,e) + XR{s/e,e), 
yit,e)=y*it,e)+YLit/e,e)+Yii{s/e,e), 
u{t,e) = u*{t,e) + UL{t/e,e) + UR{s/e,e), 
Jiu) ^j*it,e) + JL{t/e,e) + jR{s/e,e), 

with s = 1 — t. Boundary layer corrections emerge at both ends of the time interval and 
appropriate series representations can be found to all right-hand-side terms in (|9| . The eigenvalue 
condition on Hy is necessary for the stability of the left and right hand boundary layer corrections 
to be stable in forward backward time, respectively. 

The manifold h{x,u,e) (|3| is an intrinsic property of a singularly perturbed system and it 
exists independently from the use of the system as constraint of an optimal control problem. 
Hence it can be used to reduce the dimension of the optimal control problem ^ by replacing y 
and we find 

iniiij{u) = E{x{l).eh{x{l),u{l).e),e) + / fo{x,h,u,e)dt 
" Jo 

subject to: x = f{x, h, u, e), x{0) ~ xo, 
x{t) e RP, u{t) e M". 

Its solution will correspond to x*{t,e), u*{t,e), and j*{e) which means we have no way of ob- 
taining information about the boundary layer corrections in this case. If the manifold is only an 
approximation of order k with respect to its e series representation the solutions of the reduced 
problem will also be approximations of order k. 



4 Numerical Solution of the Optimal Control Problem 

We will provide a short overview of the numerical methods we use to solve general optimal control 
problems of the type 

min j{x,u,p) 

subject to: x ~ f{x, u, r) 

e{t, x, u, T, r) = 
i{t, X, u, T,r) < 

where x{t) G is the state, u{t) € R'' is the control, T e M is the final time, and r e 
are parameters. We do not discuss the solvability of the problem and conveniently assume that 
local solutions exist. In order to solve the problem numerically we have to discretize the control 
and state functions u{t) and x(t), respectively. For this purpose we use multiple shooting [3J, 
which means we divide the overall time interval [0,T] into N subintervals with node points i^, 
k = 0,1, . . . , N, tk < ifc+i, ^0 = 0, = T. On each interval the control is kept constant with 
values Uk G M™. This constant input is used to solve the ODE for x{t) on each interval with the 
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help of a numerical integration routine. Introducing x^, k = 1,2,...,A^ as initial values for the 
dynamic equation on each interval with states Xk(t), t € [tk-i, tk] and Uk as value of the control 
function we can formulate a finite dimensional nonlinear program 

N 

min S2j{xk,Uk,p) 

subject to: Xk = /{xk, Uk,p), k^l,2,...,N, 

Xk{tk) - xl^^ ^ 0; /c = l,2,...,iV-l, 
e(t,Xk,Uk,T,p) =0, t e [tk^i, k], fc = 1,2, ... ,7V, 
i{t,Xk,Uk,T,p) < t e [tk-i,k], /c = 1, 2, . . . ,iV. 

The equality constraints Xk{tk) — x^j^i ensure the continuity of the solution at the multiple 
shooting nodes. The method is implemented as a C++ program using IPOPT (3^ for solving the 
NLP, CppAD, a tool for automatic differentiation j2j for obtaining accurate derivatives in the 
NLP as well as for a BDF integrator f32] that is used to solve the ODEs on the multiple shooting 
intervals and provide sensitivities. 

5 Evaluation of the Manifold 

Eventually, we have to evaluate the manifold map y = h{x,u) for arbitrary points {x,u) C 
MP X M"*. We do not regard e as argument of h here, since it is either a fixed parameter for the 
numerical solution of the optimal control problem in case of singularly perturbed problems or we 
regard general problems without explicit dependence on a small parameter e. We will discuss two 
alternatives: Solving the model reduction problem online, i.e. whenever an evaluation of h{x, u) 
is needed while solving the optimal control problem, and interpolation of offline precomputed 
data obtained by evaluating h{x,u) on a discrete set of points C C M'' x M™. Both methods 
have intrinsic advantages and disadvantages. The online method can be easily applied since no 
preparation steps have to be taken. However, calculating u) is costly and involves the solution 
of a NLP which could slow down the overall computation. On the contrary the interpolation 
approach provides a direct, fast, and easy to evaluate object, that promises a larger speed up. 
However, it suffers from the need to precompute the manifold data and building the interpolation 
object, both tasks take a considerable amount of time. This makes this approach only effective if 
the optimal control problem has to be solved very often (e.g. in NMPC) so that the time spend 
in the preliminary stages is outweighed by the overall performance gain. Furthermore, especially 
for higher dimensional problems, the storage needed for the interpolation data might be to large 
to be handled properly for given hardware resources. 

5.1 Online Evaluation 

If an evaluation of h{x,u) is needed for a certain {xq,Uq) we solve the model reduction problem 
Q with the slow states and control fixed to {xo,uo). For performance reasons only the local 
formulation ([6| is feasible because integration of the full model is dispensed with and a general 
Gaufi-Newton method can be used for efficient solution of the minimization problem [31] [SUl HD] . 
Although it seems that this approach contradicts the purpose of model reduction, since the full 
right hand side still has to be evaluated, there is a computational advantage due to the decreased 
stiffness of the reduced model. In addition the points at which h{x,u) has to be evaluated will 
typically be close to each other which makes it possible to use warm starts for the Gaufi-Newton 
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procedure. Thus, in general, only very few iterations will be needed to solve the model reduction 
problem. This is in principle similar to solving an explicitly given differential-algebraic equation 
where usually inside the integration routine in each time step only a few iterations of a nonlinear 
equation solver are needed for the algebraic part of the dynamic problem. 



5.2 Interpolation 

The interpolation approach is based on an interpolation function h(x, u) that will be used instead 
of a pointwise computation of h{x,u). We choose radial basis function (RBF) interpolation 
|36j because it is independent of the dimension of the input space and configuration of the 
interpolation nodes (i.e. grid free). The following short presentation of the subject is strongly 
based on Given a set of nodes C = {xk}k=ii C C O C M™, radial basis interpolants are of 
the form 

N 



fe=i 

with basis functions $ : M™ x M'" — > M. The domain C M™ is assumed to be open and 
bounded and satisfy an interior cone condition. The interpolation is carried out by determining 
the coefficients Xk such that 

Lkis) ^ Lkif) ^ fk, AeM, (10) 

holds, where f,sEH are functions, H is a, Hilbert space of functions M'^ and Lk G H* are 

linear functionals from the dual of H. The following property of a function $ is central. 



Definition 2 (Positive definite function). A continuous function $ : M™ x M™ — >• M is said to 

1^=1' 



be positive definite if for all G N, all sets of pairwise distinct nodes C — {xk}^^i, and all 



a e \{0} it holds that 

N N 

e=i k=i 

Remark 2. Positive definite functions ^{x,y) are also known as kernels and give rise to repro- 
ducing kernel Hilbert spaces, a topic we do not want to pursue any further here, see the book by 
Wendland 1361 for more details. 



In the most simple case of pointwise evaluation as the functionals in ( 10 ), i.e. Lfc(s) = Sx^s — 
s{xk), the interpolation condition becomes 

N 

Sx,s = s{xe) = ^Xk^{xi,Xk) = SxJ = fi 
fc=i 

and we obtain the linear system 

AX = F, Ae^k^'^ixi^Xk), A = (Ai A2 ... Aa,)'^, F (/i /a ... /at)^, 

with A positive definite since a^Aa > holds for all a G \{0} by definition. That guaranties 
a unique solution to the interpolation problem for all sets of pairwise distinct nodes. 

The same approach can also used for Hermite interpolation where we interpolate not only 
the function value itself but also partial derivatives. The relevant functionals are thus given 
by 

ife = <5,.oD"S fc = 1,2,...,7V, afcGN^ 
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where the differential operator D"'' indicates concatenated derivatives according to the multi- 
index ak- fn general we demand that Xk 7^ xi or 7^ ai for k ^ i to guarantee linear 
independence of the functionals and therefore a unique solution to the interpolation problem. 
With our model reduction procedure for each interpolation node the function value and all partial 
derivatives of first order can be computed. If the interpolation nodes are chosen pairwise distinct 
than the linear independence follows. The interpolant is given by 

N 

The subscript 2 of the differential operator indicates differentiation with respect to the second 
variable. The interpolation matrix has entries of the form 

and it can be shown that for positive definite and sufficiently smooth $ it is again also positive 
definite and thus the interpolation problem can be uniquely solved. 

In practice univariate radial basis functions are commonly used, i.e. :^(j){\\x~ j/Hj)- 

We will use the Gaussian function 

<^(r) = e-'='''', ceM,c>0. 

The parameter c is called the shape parameter. It plays an essential role for the interpolation error 
and the stability of the interpolation process by virtue of the fact that it is strongly connected 
to the condition of the interpolation matrix. In the case of pointwise interpolation the entries 
of A are e~'^ W^e-^^h^ i,k = 1,2, N which, for c — >■ 0, will converge to 1 for all xi, Xk € M™. 
Conversely, for c — )■ 00 the basis function cf) will either converge to if ^ Xk or to 1 if a;^ = Xk- 
This means the matrix A tends to being singular in the first case and becoming the unit matrix 
in the second case. The interpolation error \\s — f\\ is also subject to the same trade of principle. 
Determining a "good" c is crucial for the performance of the interpolation. Therefore we use 
the algorithm suggested in which is based on a computational favorable reformulation of a 
leave-one-out optimization scheme. 

Since we want to use the interpolation in the optimization we are interested in a fast eval- 
uation. We use a partition of unity approach to divide the domain of interest into smaller 
subdomains and thereby bound the computing cost. To be more precise we look for a over- 
lapping covering of by open and bounded sets j — 1,2, ... , M and continuous functions 
ojj{x) : ^ K such that 

M 

LUj (x) = 1, \/x G n and LOj{x) — 0, Vx ^ Qj. 

i=i 

Assuming that we have a feasible covering {Qj} we can build local interpolants Sj{x) for each 
^Ij. Additionally let 

l{x) = {j\x e n,} 

be an index function that returns the indices of the patches a point x is contained in. A global 
interpolant is then simply given by 

s{x) = H Wj(a;)sj(x). 
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Under certain conditions on the covering and the ujj it can be shown that the global interpolant 
enjoys the same error rates as in the naive global approach. 

Two key ingredients are needed to really take computational advantage of the method: Parti- 
tion 51 in a way so that all Vlj contain about the same amount of node points and have the index 
function I{x) be 0(1) in terms of computing time, i.e. the cost of finding the patches a random 
point lies in does neither depend on the overall number of centers N nor on the number of patches 
M. In that case evaluation is 0(1) also, since for any number of nodes we can partition fl in 
a way that the number of points in a patch is below a certain constant threshold which means 
that the sums, that have to be evaluated for each patch have a constant number of terms which 
together with the constant time look-up leads to constant evaluation time. 

If we assume the node points Xk to be uniformly distributed in one practical way to achieve 
both aims is to use a fixed- grid structure which consists of axis parallel overlapping boxes. 
Because of the parallelism many operations, like index querying can be independently performed 
in each dimension. Given a set of nodes C, overlap factor 7 e (0,0.5), and a lower bound for 
the average number of points in one box the number of boxes and their border coordinates can 
be computed. If o^, i ~ 1, 2, . . . , d is the length of a box in dimension i the overlap factor 7 
determines the fraction of oi by which two boxes overlap in dimension i, 7 < 0.5 ensures that a 
point X can only be in maximal 2 boxes per coordinate direction. 

The last missing part are the Wj. We choose dimension independent radial polynomials as 
suggested in [35] for the same purpose. They are given by 



\'pohj {x) X £ rij , 
I else, 



ujj{x) = 
with p : M ^ M, 

p{r) = -6r^ + 15r^ - lOr^ + 1 

a polynomial that fulfills the spline like conditions p(0) — 1, p{l) — 0, 0*^^(0) — D'^p(l) = 0, 
/c = 1,2 and bj : ^ [0,1], 

where P e M™ and G M™ are the lower left and upper right corner coordinates of the box 
Uj, respectively. The polynomial p gives rise to a two times continuously differentiable global 
interpolant s{x). 

Besides function evaluation we at least also need first order partial derivatives during the opti- 
mization procedure which are computed by differentiating the interpolation function symbolically 
with respect to x^. 

6 Results 

We present some numerical examples. Computation times given were obtained on an Intel Xeon 
E5620 (2.40 GHz) machine running 64 bit Debian Squeeze. 
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6.1 Enzym Kinetics 

The first example is based on a simple Michaelis-Menten enzyme kinetics in singularly perturbed 
form [23]. The (full) problem is 

min / ~50y + u^dt 



Jo 

subject to: x = —x + {x + 0.5)y + u, (11) 
ey = X — {x + 1.0)?/, 
xiO) = l, 2/(0) =2/0. 

The control and the objective are artificial and not related to a realistic model scenario. The 
reduced problem is obtained by replacing y with h{x,u), eliminating the ODE for y and the 
initial condition yg. The reduced problem is thus 



mm 



5 

-50/i(x,m) + u^dt 







(12) 

subject to: x = —x + {x + 0.5)h{x, u) + m, 
x(0) = 1. 

We set e = 10^^ and use the multiple shooting approach described above for numerical solution. 
To obtain initial values for x and y at the multiple shooting nodes we integrate the uncontrolled 
system {u(t) — 0) numerically starting at x{0) = and y{0) = j/q- The overall time interval is 
divided into 40 equidistant multiple shooting intervals. The termination tolerance for IPQPT was 
set to 10^** and the integration tolerance of the BDF-integrator to 10^^. The discretized full 
problem has 204 variables whereas the reduced problem has 163. Lastly, bounds are introduced 
for the state variables and the control. We use the lower bounds xi — yi ~ ui — and the upper 
bounds Xu = yu = Uu = 5.5 on all multiple shooting nodes. 



The performance of the full system (111 depends on the initial value yo- For yo ~ we 
have an average runtime of 2.5 s and 26 NLP iterations. For yo — 0.5, which is the first order 
approximation ho{l,0) we find 2.1s for 22 NLP iterations and lastly for yo — 1 we get 2.2s and 
22 iterations. 

Next we used the online computation of h. The algorithm clocks in at 2.1 s and 22 iterations, 
which means no performance gain compared to the full problem. However, the number of NLP 
iterations in the subproblem of approximating the manifold is interesting. The maximum is 4 
iterations but 60% of the calls terminate after 2 and 37% only after 1 iterations. The call to 
the model reduction routine is by now done through an external library which produces a lot of 
overhead, for example in terms of right hand side function evaluations. Tight integration into 
the BDF integration algorithm might lead to a significant speed up. 

For the interpolation approach one first needs to choose a reasonable set of nodes. Although 
the RBF method is grid independent for convenience we used a Cartesian grid on [—0.5, —0.5] x 
[10, 10]. The performance of the interpolator depends of course on the number of points but also 
on the number of points per patch and overlap in the partition of unity approach. To assess the 
influence we scatter searched the region {20, 30, 35, 40} x {0.025, 0.05, 0.1, 0.15} x {5, 10, 15} for 
points in each direction, overlap and points per patch respectively. Median runtime was 0.87 s 
and median number of NLP iterations 24.5. Moreover, the fastest combination took 0.61s and 
only 21 iterations compared to the slowest which needed 1.4 s and 23 iterations. It is apparent 
that using the interpolation approach in this case is beneflcial from an performance point of view. 
In the best case it is nearly 4 times faster than the full problem. 
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Table 1: Summary of various statistics concerning the solution of problem and ( |12[ |. The steps 
statistics refer to the integration and are the sums over all NLP iterations and multiple shooting 
intervals. 



Problem Time NLP iter Time per iter Steps Rej. steps 

lllf 2l5s 23 OTs 43165 7369 

(121 online 2.1s 22 0.1s 5520 1281 

(121 offline (median) 0.9s 24.5 0.04s 

( 12 1 offline (best) 0.6 s 21 0.03 s 5194 1236 




Figure 1: Example trajectories using the con- Figure 2: Example controls computed from the 

trol from the full problem (111, subscript / and full problem (111, subscript / and the reduced 

the reduced problem ( 12 1, subscript r. Both tra- problem ( 12 1, subscript r. Both controls overlap, 
jectories overlap. 



The main reason for the speed up is not so much the reduction in the number of optimization 
variables but mainly the reduced stiffness along the manifold h. In the integration routine larger 
step sizes are possible which greatly reduces the computational effort. This especially pays off in 
the multiple shooting approach since the initial values at the multiple shooting nodes are subject 
to optimization and they might be set away from the SIM for the full system in each iteration of 
the NLP solver leading to transient behavior of the fast trajectories on each interval and therefore 
forces the integrator to use small steps. An overview of example integrator statistics is given in 
table [T] as well as the result of the performance tests. 

If we use the computed optimal control from the reduced system as input of the full system 
we obtain virtually the same objective values as for the control computed with the full system. 
In this example the error of the reduction is negligible which can also be concluded from the 
system itself. Example trajectories and controls are plotted in figure [l] and [2] respectively. In 
both plots the subscript / refers to the results obtained using the computed control from the 
full system (111, whereas r refers to the results using the control computed from the reduced 
problem ( 12 ). 
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6.2 Voltage Regulator 

The next example is taken from [25], example 4.2. It describes a voltage regulator with 5 states 
governed by a set of linear ODEs. The problem is given by 

1 

min - / x\ + u^At 







11 
subject to: xi = — xi + -X2, 
5 2 

1 8 



^^--r^^-.y^^ (13) 

5 30 
eyi = --yi + yy2, 

5 15 
£2/2 = -|y2 + ^ya, 

1 3 

£2/3 = -2^/3+ 

The problem, although in essence linear-quadratic, has some interesting features: The coupling 
between the slow and fast subsystems is only through the one fast state yi which means that 
only one state has to be reproduced during the optimization, i.e. we are only interested in 
j/i = h(xi, X2,u). The reduced problem is 

1 /"^ 

min - / x'i + u^dt 



2 







subject to: xi = ^g'^i 2^^' ^^^^ 

1 8^, 
X2 = --a;2 + -h{xi,X2,u). 

First, we set e — 0.2 and solve both problems on 10 multiple shooting intervals with the IPQPT 
tolerance set to 10^'^. Again a selection of initial values for the full system was used. An 
overview is given in table [2] Note, that this selection leads to a set of two initial values for the 
reduced system, namely xa — (—10,0)"^ and xq — (—10,10)"^. The initial values for the state 
variables at the multiple shooting nodes are obtained through integrating the ode system with 
the initial control u{t) = 0. Bounds are introduced as follows: yi £ [—20,20], y2 G [10,50], 
2/3, y4, 2/5 e [-10^,108], and u e [-15,15]. 

Using the control computed from the reduced problem for the full problem leads to extremely 
large objective values in comparison and thus renders the model reduction unusable in this case. 
This can also be seen in figures [3] and [4] which compares the full system with the reduced system 
solved with the online evaluation of h{x,u). Additionally there is no runtime advantage: The 
full system needs on average 1.1s compared to the offline approach which needs 1.4 s which is 
also the median timing of the online method. 

If we increase the spectral gap by setting e — 2 ■ 10^"^ the results are much more favorable. 
First of all the computed input from the reduced model is very close to solution of the full 
problem. Therefore also the objective values are similar. See figures |5] and [6] for an example. 
With the reduced model we find a significant computational advantage, as documented in table 
[3] The runtime for the full problem depends strongly on the initial values and varies between 
2.3 s to 5.1 s which is between 5 to 10 times slower compared to the fastest solution of the reduced 
problem with the offline method. The online approach is around 2 to 3 times faster. A further 
advantage worth mentioning in this context is that the time needed for the reduced problem is 



14 



Table 2: Final objective values of problems ( 13 1 and ( 14 1 for a selection initial values and e = 0.2. 



Xo 



(131 (14), online (14 1, offline 



(-10,0,0,0,0) 


32.9 


35.1 


34.9 


(-10,0,10,0,10) 


25.1 


521.7 


612.7 


(-10,0,0,10,10) 


24.7 


648.0 


750.1 


(-10,10,10,10,10) 


20.4 


782.6 


830.3 


(-10,10,0,0,10) 


20.5 


521.4 


559.4 


(-10,10,10,0,10) 


20.0 


579.1 


619.5 




Figure 3: Example trajectories using the con- Figure 4: Example controls computed from the 
trol from the full problem (13 1, subscript / and full problem (131, subscript / and the reduced 



the reduced problem (141, subscript r with e — 
0.2. 



problem (14 1, subscript r with e — 0.2. 



15 



0.5 



1.5 



0.5 



1.5 



Figure 5: Example trajectories using the con- 
trol from the full problem fTsl, subscript / and 



the reduced problem (14 1, subscript r with e 
2 X 10"^ 



Figure 6: Example controls computed from the 
full problem ( |13[ ), subscript / and the reduced 
problem (14 1, subscript r with e — 2x 10~^. 



Table 3: Summary of various statistics concerning the solution of problem ( 13 1 and ( 14 1 for e 
0.2 X 10~^. Timings are averages over all initial values. 



Problem 



Time NLP iter Time per iter 



(13 
(12 
(141 
(14 



3.7s 37.7 

online 1.6 s 16.5 

offline (median) 1.3 s 18 

offline (best) 0.5 s 16 



0.1s 
0.1s 
0.07s 
0.03 s 



much less dependent on the initial values, which makes the computation more reliable in online 
control scenarios where the next input has to be computed within a given time frame. 

As in the enzyme example we systematically tried various parameter combinations (points 
per dimension, overlap and points per patch) for the interpolator. We already mentioned the best 
and median runtime values, however it should also be noted that bad parameter combinations 
can decrease the algorithmic performance significantly. The maximum time needed for both e 
values and sets of initial values was over 16 s. In a considerable number of cases the problem 
could not even be solved. This shows that the interpolator approach has to be tuned carefully 
but further analysis reveals that at least in this case the best configuration is the same for both 
initial values and e. 



7 Concluding Remarks 

Given an optimal control problem, the aim of model reduction is to determine and solve a smaller 
problem and use its solution as input to the full scale problem hoping for computational benefits 
while still obtaining a feasible and nearly optimal control. Our approach to reduce the dimension 
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of the state space is based on time scale separation, i.e. processes evolving on slow and fast time 
scales within the same system. According to |17l[^ 1^150] we formulate a nonlinear optimization 
problem that identifies a slow manifold in the state space, parametrized by the slow states. This 
manifold hence defines a map y = h(x, u) of the slow variables and control onto the fast variables 
and can be used to reduce the dynamic system. 

Singular perturbation theory delivers a framework for optimal control problems involving 
fast/slow differential systems. Using the Pontryagin minimum principle one arrives at a singu- 
larly perturbed boundary value problem. Its solution consists of three components: A slowly 
varying part which represents the system confined to the slow manifold and two fast vanishing 
boundary layer corrections. Because we use an approximation to the slow manifold to represent 
the fast states and thereby reduce the dimension of the state space we are only able to obtain 
an approximation to the slow part of the optimal control solution. Two examples, both of which 
are singularly perturbed systems, are used to illustrate that if the boundary layer corrections 
are small, the solution of the reduced system can produce controls that drive the full system in 
a nearly optimal fashion. Solving the reduced problem is up to ten times faster if the offline, 
interpolation based method is used for the manifold map h. The numerical scheme presented 
here can be used unmodified to solve general nonlinear optimal control problems, i.e. problems 
not explicitly in singular perturbed form. 

If larger systems, especially with more slow states are considered the interpolation approach 
will suffer from the curse of dimensionality because of the exponentially growing number of nodes 
and with it also interpolation data that has to be handled. This problem could to a certain degree 
be overcome by using a more suited data structure (e.g. kd-trees) and by using a tight, problem 
specific state space and control domain and take advantage of the ability of using scattered nodes. 

The online method is not subject to the dimensionality problem, however its evaluation for 
one input point takes considerably longer since the full system has to be evaluated in the general 
Gaufi-Newton method. Still the benefit from reduced stiffness can speed up the overall solution 
of the optimal control problem. A tight integration into the ODE integration routine, similar to 
a DAE solver would greatly decrease unnecessary overhead and increase speed and stability. 

References 

[1] A. Astolfi. Model reduction by moment matching for linear and nonlinear systems. Auto- 
matic Control, IEEE Transactions on, 55(10) :2321 -2336, oct. 2010. 

[2] Bradley M. Bell. Automatic differentiation software cppad., 2010. 

[3] Hans Georg Bock and Karl J. Plitt. A multiple shooting algorithm for direct solution of 
optimal control problems. In Proceedings of the Ninth IFAC World Congress, Budapest. 
Pergamon, Oxford, 1984. 

[4] AE Bryson and YC Ho. Applied Optimal Control: Optimization, Estimation and Control. 
Taylor and Francis, London, 1975. 

[5] M. Diehl, H. G. Bock, J. P. Schloder, R. Findeisen, Z. Nagy, and F. AUgower. Real-time 
optimization and nonlinear model predictive control of processes governed by differential- 
algebraic equations. Journal of Process Control, 12(4):577-585, 2002. 

[6] M. Dmitriev and G. Kurina. Singular perturbations in control problems. Automation and 
Remote Control, 67(1): 1-43, January 2006. 



17 



[7] Neil Fenichel. Geometric singular perturbation theory for ordinary differential equations. 
Journal of Differential Equations, 31:53-98, 1979. 

[8] Rolf Findeisen, Lars Imsland, Frank Allgower, and Bjarne A. Foss. State and output feed- 
back nonlinear model predictive control: An overview. European Journal of Control, 9(2- 
3):190~207, 2003. 

[9] K. Fujimoto and J. Schcrpcn. Balanced realization and model order reduction for nonlinear 
systems based on singular value analysis. SIAM Journal on Control and Optimization, 
48(7):4591-4623, 2010. 

[10] Juergen Hahn and Thomas F. Edgar. An improved method for nonlinear model reduction 
using balancing of empirical gramians. Computers & Chemical Engineering, 26(10):1379 - 
1397, 2002. 

[11] Frank Hoppensteadt. Properties of solutions of ordinary differential equations with small 
parameters. Communications on Pure and Applied Mathematics, 24:807-840, 1971. 

[12] Hans G. Kaper and Tasso Joost Kaper. Asymptotic analysis of two reduction methods for 
systems of chemical reactions. Physica D, 165:66-93, 2002. 

[13] Gaetan Kerschen, Jean-Claude Golinval, Alexander F. Vakakis, and Lawrence A. Bergman. 

The method of proper orthogonal decomposition for dynamical characterization and order 
reduction of mechanical systems: An overview. Nonlinear Dynamics, 41:147-169, 2005. 
10.1007/sll071-005-2803-2. 

[14] Petar V. Kokotovic. Applications of singular perturbation techniques to control problems. 
SIAM Review, 26(4):501-550, 1984. 

[15] Karl Kunisch and Stefan Volkwein. Proper orthogonal decomposition for optimality systems. 
ESAIM: Mathematical Modelling and Numerical Analysis, 42(01):l-23, 2008. 

[16] Sanjay Lall, Jerrold E. Marsdcn, and Sonja Glavaski. A subspace approach to balanced 
truncation for model reduction of nonlinear control systems. International Journal of Robust 
and Nonlinear Control, 12(6):519-535, 2002. 

[17] Dirk Lebiedz. Computing minimal entropy production trajectories: An approach to model 
reduction in chemical kinetics. Journal of Chemical Physics, 120(15) :6890-6897, April 2004. 

[18] Dirk Lebiedz. .Julia Kammcrcr, and Ulrich Brandt-PoUmann. Automatic network cou- 
pHng analysis for dynamical systems based on detailed kinetic models. Physical Review 
E, 72(4):041911, October 2005. 

[19] Dirk Lebiedz, Volkmar Reinhardt, and Jochen Siehr. Minimal curvature trajectories: Rie- 
mannian geometry concepts for slow manifold computation in chemical kinetics. Journal of 
Computational Physics, 229(18) :6512-6533, September 2010. 

[20] Dirk Lebiedz and Jochen Siehr. A continuation method for the efficient solution of para- 
metric optimization problems in kinetic model reduction. arXiv:1301.5815, 2013. 

[21] Dirk Lebiedz, Jochen Siehr, and Jonas Unger. A variational principle for computing slow in- 
variant manifolds in dissipative dynamical systems. SIAM Journal on Scientific Computing, 
33(2):703-720, 2011. 



18 



[22] R. Marino and P.V. Kokotovic. A geometric approach to nonlinear singularly perturbed 
control systems. Automatica, 24(1) :31 - 41, 1988. 

[23] W. Marquardt. Nonlinear model reduction for optimization based control of transient chem- 
ical processes. In J.W. Eaton J. B. Rawlings, B.A. Ogunnaike, editor, Chemical Process 
Control VI, Tuscon, Arizona, 7-12.1.2001, number 326, pages 12-42, 2002. 

[24] J. D. Murray. Mathematical Biology I: An Introduction. Springer, 1993. 

[25] D. Subbaram Naidu. Singular Perturbation Methodology in Control Systems. Institution of 

Engineering and Technology, 1988. 

[26] D. Subbaram Naidu. Singular perturbations and time scales in control theory and appli- 
cations: an overview. Dynamics of Continuous, Discrete and Impulsive Systems, Series B: 
Applications and Algorithms, 9:233-278, 2002. 

[27] R. O'Malley. Singular perturbations and optimal control. In W. Coppel, editor. Mathemat- 
ical Control Theory, volume 680 of Lecture Notes in Mathematics, pages 170-218. Springer 
Berlin / Heidelberg, 1978. 10.1007/BFb0065317. 

[28] Volkmar Reinhardt, Miriam Winckler, and Dirk Lebiedz. Approximation of slow attracting 
manifolds in chemical kinetics by trajectory-based optimization approaches. Journal of 
Physical Chemistry A, 112(8):1712-1718, 2008. 

[29] Shmuel Rippa. An algorithm for selecting a good value for the parameter c in radial 
basis function interpolation. Advances in Computational Mathematics, 11:193-210, 1999. 
10. 1023 /A:1018975909870. 

[30] Jochen Siehr. Numerical optimization methods within a continuation strategy for the reduc- 
tion of chemical combustion models. PhD thesis, Ruprecht-Karls-Universitat Heidelberg, 

Heidelberg, Germany, 2012. 

[31] Jochen Siehr and Dirk Lebiedz. An optimization approach to kinetic model reduction for 
combustion chemistry. Flow, Turbulence and Combustion, page submitted, 2012. 

[32] Dominik Skanda. Robust optimal experimental design for model discrimination of kinetic 
ODE systems. PhD thesis. University of Freiburg, Freiburg im Breisgau, Germany, 2012. 

[33] Ireneusz Tobor, Patrick Renter, and Ghristophe Schlick. Efficient reconstruction of large 
scattered geometric datasets using the partition of unity and radial basis functions. In 

WSCG, pages 467 474, 2004. 

[34] A. B Vasil'eva and M. G. Dmitriev. Singular perturbations in optimal control problems. 
Journal of Mathematical Sciences, 34:1579-1629, 1986. 

[35] Andreas Wachter and Lorenz T. Biegler. On the implementation of a primal-dual inte- 
rior point filter line search algorithm for large-scale nonlinear programming. Mathematical 

Programming, 106(1):25 57, 2006. 

[36] Holger Wendland. Scattered Data Approximation. Cambridge University Press, 2005. 

[37] Antonios Zagaris, C. William Gear, Tasso Joost Kaper, and Yannis G. Kevrekidis. Analysis 
of the accuracy and convergence of equation-free projection to a slow manifold. ESAIM: 
Mathematical Modelling and Numerical Analysis, 43(4):757-784, 2009. 



19 



